Nominal Variables Against Target
# Load necessary libraries
library(ggplot2)
library(gridExtra)
# Nominal variables in the dataset, excluding OnlineReservation, DeliveryOrder, and LoyaltyProgramMember
nominal_vars <- c("ServiceRating", "FoodRating", "AmbianceRating", "Gender",
                  "VisitFrequency", "PreferredCuisine", "TimeOfVisit",
                  "DiningOccasion", "MealType")
# Loop through each nominal variable to generate the plots and interpretations
for (nominal_var in nominal_vars) {
  # Convert the nominal variable to a factor (if needed)
  satisfaction[[nominal_var]] <- as.factor(satisfaction[[nominal_var]])
  # Display variable title as a header
  cat("\n## ", nominal_var, " vs HighSatisfaction\n\n", sep = "")
  # Print table with margins for each nominal variable
  print(addmargins(table(satisfaction[[nominal_var]], satisfaction$HighSatisfaction,
                         dnn = c(nominal_var, "HighSatisfaction"))))
  # Create title strings
  count_title <- paste("Count of", nominal_var, "vs HighSatisfaction")
  proportion_title <- paste("Proportion of", nominal_var, "vs HighSatisfaction")
  # Barplot with fill based on HighSatisfaction (Count)
  # .data[[nominal_var]] replaces the deprecated aes_string()
  p1 <- ggplot(data = satisfaction) +
    geom_bar(aes(x = .data[[nominal_var]], fill = HighSatisfaction)) +
    scale_fill_manual(values = c("palevioletred1", "darkseagreen1")) +
    labs(title = count_title, x = nominal_var, y = "Count") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5, size = 12, face = "bold"))
  # Print the first plot (count)
  print(p1)
  # Interpretation for the first graph (Count)
  # sep = "" avoids stray spaces before punctuation
  cat("\nInterpretation for Count Graph:\n")
  cat("For ", nominal_var, ", the barplot shows the count of customers by satisfaction level (0 = Unsatisfied, 1 = Satisfied) in each category of ", nominal_var,
      ". Look for clear differences in satisfaction between categories of ", nominal_var, ".\n\n", sep = "")
  # Stacked Barplot (with proportions)
  p2 <- ggplot(data = satisfaction) +
    geom_bar(aes(x = .data[[nominal_var]], fill = HighSatisfaction), position = "fill") +
    scale_fill_manual(values = c("palevioletred1", "darkseagreen1")) +
    labs(title = proportion_title, x = nominal_var, y = "Proportion") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5, size = 12, face = "bold"))
  # Print the second plot (proportion)
  print(p2)
  # Interpretation for the proportion graph with variable-specific insights
  cat("\nInterpretation for Proportion Graph:\n")
  if (nominal_var == "ServiceRating") {
    cat("- Customers who gave service ratings of 4 or 5 are satisfied more often (about 18%) than those who gave ratings of 1 to 3 (about 10% to 11%).\n")
    cat("- The proportion plot shows a step up between ratings 3 and 4 rather than a gradual increase across the scale.\n")
  } else if (nominal_var == "FoodRating") {
    cat("- As with ServiceRating, customers who rated the food highly (4 and 5) are satisfied more often (about 18% and 23%).\n")
    cat("- Customers who rated the food 1 to 3 are satisfied less often (about 6% to 11%), with rating 3 the lowest.\n")
  } else if (nominal_var == "AmbianceRating") {
    cat("- Ambiance ratings of 4 and 5 are associated with a higher proportion of satisfied customers (about 19% and 16%) than ratings of 1 to 3 (about 9% to 12%).\n")
    cat("- The proportion plot suggests that ambiance is related to customer satisfaction, although the pattern is not strictly increasing.\n")
  } else if (nominal_var == "Gender") {
    cat("- Satisfaction does not appear to differ noticeably between genders (about 13% to 14% satisfied in each group).\n")
    cat("- Both the count and proportion plots show a similar distribution for male and female customers.\n")
  } else if (nominal_var == "VisitFrequency") {
    cat("- Weekly visitors have the highest proportion of satisfied customers (about 20%), followed by daily visitors (about 15%).\n")
    cat("- Monthly and rare visitors are satisfied least often (about 8% and 6%), so satisfaction is higher among frequent visitors but does not rise steadily with frequency.\n")
  } else if (nominal_var == "PreferredCuisine") {
    cat("- Preferred cuisine shows little variation in satisfaction levels (about 12% to 15% satisfied).\n")
    cat("- Customers who prefer American or Indian cuisine have a slightly higher proportion of satisfaction.\n")
  } else if (nominal_var == "TimeOfVisit") {
    cat("- Time of visit (Breakfast, Lunch, or Dinner) does not seem to have a strong impact on satisfaction.\n")
    cat("- The proportion of satisfied customers remains relatively consistent across meal times (about 12% to 14%).\n")
  } else if (nominal_var == "DiningOccasion") {
    cat("- Celebrations have the highest proportion of satisfied customers (about 19%).\n")
    cat("- Business and casual dining occasions show lower proportions (about 9% and 11%).\n")
  } else if (nominal_var == "MealType") {
    cat("- Satisfaction levels differ noticeably between dine-in and takeaway customers.\n")
    cat("- The proportion of satisfied customers is higher for dine-in (about 18%) than for takeaway (about 8%).\n")
  }
  cat("\n\n")
}
## ServiceRating vs HighSatisfaction
              HighSatisfaction
ServiceRating     0    1  Sum
1               261   31  292
2               258   31  289
3               273   29  302
4               242   53  295
5               265   57  322
Sum            1299  201 1500

Interpretation for Count Graph:
For ServiceRating, the barplot shows the count of customers by satisfaction level (0 = Unsatisfied, 1 = Satisfied) in each category of ServiceRating. Look for clear differences in satisfaction between categories of ServiceRating.

Interpretation for Proportion Graph:
- Customers who gave service ratings of 4 or 5 are satisfied more often (about 18%) than those who gave ratings of 1 to 3 (about 10% to 11%).
- The proportion plot shows a step up between ratings 3 and 4 rather than a gradual increase across the scale.
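
The message of the proportion plot can be checked numerically from the printed counts. The snippet below is a minimal sketch, not part of the original analysis: it re-enters the ServiceRating counts by hand (the object names `service_counts` and `service_props` are illustrative) and uses `prop.table()` to get the share of unsatisfied and satisfied customers within each rating.

```r
# Counts copied by hand from the ServiceRating table above
# (rows = rating 1..5, columns = HighSatisfaction 0/1).
service_counts <- matrix(
  c(261, 31,
    258, 31,
    273, 29,
    242, 53,
    265, 57),
  ncol = 2, byrow = TRUE,
  dimnames = list(ServiceRating = as.character(1:5),
                  HighSatisfaction = c("0", "1"))
)

# Row-wise proportions: each row sums to 1
service_props <- prop.table(service_counts, margin = 1)
print(round(service_props, 3))
```

With the full data frame, the same result should come from `prop.table(table(...), margin = 1)` applied to the two columns directly.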
## FoodRating vs HighSatisfaction
           HighSatisfaction
FoodRating     0    1  Sum
1            282   31  313
2            245   29  274
3            296   19  315
4            247   53  300
5            229   69  298
Sum         1299  201 1500

Interpretation for Count Graph:
For FoodRating, the barplot shows the count of customers by satisfaction level (0 = Unsatisfied, 1 = Satisfied) in each category of FoodRating. Look for clear differences in satisfaction between categories of FoodRating.

Interpretation for Proportion Graph:
- As with ServiceRating, customers who rated the food highly (4 and 5) are satisfied more often (about 18% and 23%).
- Customers who rated the food 1 to 3 are satisfied less often (about 6% to 11%), with rating 3 the lowest.
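
Because the ratings are ordered, one possible follow-up is a test for a linear trend in proportions. The sketch below is an assumption about how this could be checked, not something the original analysis does: it feeds the hand-copied FoodRating counts to `prop.trend.test()` from base R's stats package, using the rating values 1 to 5 as scores.

```r
# Satisfied counts and row totals copied from the FoodRating table above
satisfied <- c(31, 29, 19, 53, 69)
totals    <- c(313, 274, 315, 300, 298)

# Chi-squared test for trend in proportions across the ordered ratings;
# the scores 1:5 assume equal spacing between rating levels.
food_trend <- prop.trend.test(x = satisfied, n = totals, score = 1:5)
print(food_trend)
```

A small p-value here only indicates an overall upward or downward trend; it does not show that every step up in rating raises satisfaction.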
## AmbianceRating vs HighSatisfaction
               HighSatisfaction
AmbianceRating     0    1  Sum
1                285   39  324
2                270   28  298
3                243   25  268
4                236   57  293
5                265   52  317
Sum             1299  201 1500

Interpretation for Count Graph:
For AmbianceRating, the barplot shows the count of customers by satisfaction level (0 = Unsatisfied, 1 = Satisfied) in each category of AmbianceRating. Look for clear differences in satisfaction between categories of AmbianceRating.

Interpretation for Proportion Graph:
- Ambiance ratings of 4 and 5 are associated with a higher proportion of satisfied customers (about 19% and 16%) than ratings of 1 to 3 (about 9% to 12%).
- The proportion plot suggests that ambiance is related to customer satisfaction, although the pattern is not strictly increasing.
## Gender vs HighSatisfaction
       HighSatisfaction
Gender     0    1  Sum
Female   659  100  759
Male     640  101  741
Sum     1299  201 1500

Interpretation for Count Graph:
For Gender, the barplot shows the count of customers by satisfaction level (0 = Unsatisfied, 1 = Satisfied) in each category of Gender. Look for clear differences in satisfaction between categories of Gender.

Interpretation for Proportion Graph:
- Satisfaction does not appear to differ noticeably between genders (about 13% to 14% satisfied in each group).
- Both the count and proportion plots show a similar distribution for male and female customers.
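
A visual impression of "no difference" can be backed by a test of independence. The following is a minimal sketch under the assumption that a Pearson chi-squared test is acceptable here; it is not taken from the original analysis, and `gender_counts` is just the printed table re-entered by hand.

```r
# 2 x 2 counts copied from the Gender table above
gender_counts <- matrix(
  c(659, 100,
    640, 101),
  nrow = 2, byrow = TRUE,
  dimnames = list(Gender = c("Female", "Male"),
                  HighSatisfaction = c("0", "1"))
)

# chisq.test() applies Yates' continuity correction by default for 2 x 2 tables
gender_test <- chisq.test(gender_counts)
print(gender_test)
```

A large p-value is consistent with the plots, but it does not prove that satisfaction is identical across genders.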
## VisitFrequency vs HighSatisfaction
               HighSatisfaction
VisitFrequency     0    1  Sum
Daily            130   23  153
Monthly          394   34  428
Rarely           293   20  313
Weekly           482  124  606
Sum             1299  201 1500

Interpretation for Count Graph:
For VisitFrequency, the barplot shows the count of customers by satisfaction level (0 = Unsatisfied, 1 = Satisfied) in each category of VisitFrequency. Look for clear differences in satisfaction between categories of VisitFrequency.

Interpretation for Proportion Graph:
- Weekly visitors have the highest proportion of satisfied customers (about 20%), followed by daily visitors (about 15%).
- Monthly and rare visitors are satisfied least often (about 8% and 6%), so satisfaction is higher among frequent visitors but does not rise steadily with frequency.
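
The table and plots list VisitFrequency alphabetically, which hides the natural ordering of the categories. The sketch below shows one way to reorder them from least to most frequent before reading off the satisfied share. The ordering in `visit_levels` is an assumption about what the labels mean, and the counts are re-entered by hand from the table above.

```r
# Counts copied from the VisitFrequency table above (alphabetical row order)
visit_counts <- matrix(
  c(130, 23,
    394, 34,
    293, 20,
    482, 124),
  ncol = 2, byrow = TRUE,
  dimnames = list(VisitFrequency = c("Daily", "Monthly", "Rarely", "Weekly"),
                  HighSatisfaction = c("0", "1"))
)

# Assumed ordering from least to most frequent
visit_levels <- c("Rarely", "Monthly", "Weekly", "Daily")

# Reorder rows, then compute the satisfied share per category
ordered_counts <- visit_counts[visit_levels, ]
satisfied_rate <- ordered_counts[, "1"] / rowSums(ordered_counts)
print(round(satisfied_rate, 3))
```

In the plotting loop, the same ordering could be applied with `factor(satisfaction$VisitFrequency, levels = visit_levels)` so that the bars appear in this order.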
## PreferredCuisine vs HighSatisfaction
                 HighSatisfaction
PreferredCuisine     0    1  Sum
American           229   41  270
Chinese            268   42  310
Indian             253   43  296
Italian            285   40  325
Mexican            264   35  299
Sum               1299  201 1500

Interpretation for Count Graph:
For PreferredCuisine, the barplot shows the count of customers by satisfaction level (0 = Unsatisfied, 1 = Satisfied) in each category of PreferredCuisine. Look for clear differences in satisfaction between categories of PreferredCuisine.

Interpretation for Proportion Graph:
- Preferred cuisine shows little variation in satisfaction levels (about 12% to 15% satisfied).
- Customers who prefer American or Indian cuisine have a slightly higher proportion of satisfaction.
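
Small differences between categories are easier to judge with an interval around each proportion. As a rough sketch (not part of the original analysis), the code below computes a 95% confidence interval for the satisfied share of each cuisine with `prop.test()`, using counts re-entered by hand; the names `satisfied`, `totals`, and `cuisine_ci` are illustrative.

```r
# Satisfied counts and row totals copied from the PreferredCuisine table above
satisfied <- c(American = 41, Chinese = 42, Indian = 43, Italian = 40, Mexican = 35)
totals    <- c(American = 270, Chinese = 310, Indian = 296, Italian = 325, Mexican = 299)

# One-sample prop.test() per cuisine; keep only the confidence interval
cuisine_ci <- t(mapply(
  function(x, n) as.numeric(prop.test(x, n)$conf.int),
  satisfied, totals
))
colnames(cuisine_ci) <- c("lower", "upper")
print(round(cuisine_ci, 3))
```

Heavily overlapping intervals are in line with the reading that cuisine preference makes little difference, though overlap alone is not a formal test.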
## TimeOfVisit vs HighSatisfaction
            HighSatisfaction
TimeOfVisit     0    1  Sum
Breakfast     434   72  506
Dinner        425   67  492
Lunch         440   62  502
Sum          1299  201 1500

Interpretation for Count Graph:
For TimeOfVisit, the barplot shows the count of customers by satisfaction level (0 = Unsatisfied, 1 = Satisfied) in each category of TimeOfVisit. Look for clear differences in satisfaction between categories of TimeOfVisit.

Interpretation for Proportion Graph:
- Time of visit (Breakfast, Lunch, or Dinner) does not seem to have a strong impact on satisfaction.
- The proportion of satisfied customers remains relatively consistent across meal times (about 12% to 14%).
## DiningOccasion vs HighSatisfaction
               HighSatisfaction
DiningOccasion     0    1  Sum
Business         453   47  500
Casual           428   53  481
Celebration      418  101  519
Sum             1299  201 1500

Interpretation for Count Graph:
For DiningOccasion, the barplot shows the count of customers by satisfaction level (0 = Unsatisfied, 1 = Satisfied) in each category of DiningOccasion. Look for clear differences in satisfaction between categories of DiningOccasion.

Interpretation for Proportion Graph:
- Celebrations have the highest proportion of satisfied customers (about 19%).
- Business and casual dining occasions show lower proportions (about 9% and 11%).
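
To put a number on how strongly DiningOccasion and HighSatisfaction are associated, one option is Cramér's V, which rescales the chi-squared statistic to the range 0 to 1. The sketch below computes it by hand from the printed counts; it is an illustrative addition, and `occasion_counts` and `cramers_v` are not names from the original code.

```r
# 3 x 2 counts copied from the DiningOccasion table above
occasion_counts <- matrix(
  c(453, 47,
    428, 53,
    418, 101),
  ncol = 2, byrow = TRUE,
  dimnames = list(DiningOccasion = c("Business", "Casual", "Celebration"),
                  HighSatisfaction = c("0", "1"))
)

# Pearson chi-squared test (no continuity correction for tables larger than 2 x 2)
occasion_test <- chisq.test(occasion_counts)

# Cramér's V = sqrt(chi^2 / (n * (min(rows, cols) - 1)))
n <- sum(occasion_counts)
k <- min(dim(occasion_counts)) - 1
cramers_v <- sqrt(unname(occasion_test$statistic) / (n * k))
print(round(cramers_v, 3))
```

Values near 0 indicate a weak association and values near 1 a strong one; the usual thresholds for "small" or "moderate" are conventions rather than fixed rules.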
## MealType vs HighSatisfaction
         HighSatisfaction
MealType     0    1  Sum
Dine-in    613  138  751
Takeaway   686   63  749
Sum       1299  201 1500

Interpretation for Count Graph:
For MealType, the barplot shows the count of customers by satisfaction level (0 = Unsatisfied, 1 = Satisfied) in each category of MealType. Look for clear differences in satisfaction between categories of MealType.

Interpretation for Proportion Graph:
- Satisfaction levels differ noticeably between dine-in and takeaway customers.
- The proportion of satisfied customers is higher for dine-in (about 18%) than for takeaway (about 8%).
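
For a two-category variable such as MealType, the gap can be summarised as an odds ratio and as a difference in proportions. The snippet below is a sketch using the printed counts; the object names are illustrative, and `prop.test()` is one of several reasonable ways to get an interval for the difference.

```r
# Counts copied from the MealType table above
dine_in  <- c(unsatisfied = 613, satisfied = 138)
takeaway <- c(unsatisfied = 686, satisfied = 63)

# Odds of high satisfaction in each group, and their ratio
odds_dine_in  <- dine_in[["satisfied"]] / dine_in[["unsatisfied"]]
odds_takeaway <- takeaway[["satisfied"]] / takeaway[["unsatisfied"]]
odds_ratio    <- odds_dine_in / odds_takeaway
print(round(odds_ratio, 2))

# Difference in satisfied proportions with a 95% confidence interval
meal_test <- prop.test(
  x = c(dine_in[["satisfied"]], takeaway[["satisfied"]]),
  n = c(sum(dine_in), sum(takeaway))
)
print(meal_test$conf.int)
```

An odds ratio above 1 means dine-in customers have higher odds of high satisfaction than takeaway customers; it describes association in this sample and says nothing about cause.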
Correlations between numerical variables
# Load necessary libraries
library(ggplot2)
library(gridExtra)
# Define numerical variables
numerical_vars <- c("Age", "Income", "AverageSpend", "GroupSize", "WaitTime")
# Loop through each variable to create plots and interpretations
for (var in numerical_vars) {
  # Adjusted titles with line breaks
  boxplot_title <- paste("Boxplot of", var, "\nby HighSatisfaction")
  density_title <- paste("Density Plot of", var, "\nby HighSatisfaction")
  # Boxplot for the variable by HighSatisfaction
  # (named box_plot to avoid masking base R's boxplot();
  # .data[[var]] replaces the deprecated aes_string())
  box_plot <- ggplot(satisfaction, aes(x = HighSatisfaction, y = .data[[var]])) +
    geom_boxplot(aes(fill = HighSatisfaction), outlier.colour = "red", outlier.size = 2) +
    labs(title = boxplot_title, y = var, x = "HighSatisfaction") +
    theme_minimal() +
    scale_fill_manual(values = c("palevioletred1", "darkseagreen1")) +
    theme(plot.title = element_text(hjust = 0.5, size = 14, face = "bold"))
  # Density plot for the variable by HighSatisfaction
  density_plot <- ggplot(satisfaction, aes(x = .data[[var]], fill = HighSatisfaction)) +
    geom_density(alpha = 0.3) +
    scale_fill_manual(values = c("palevioletred1", "darkseagreen1")) +
    labs(title = density_title, x = var, y = "Density") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5, size = 14, face = "bold"))
  # Arrange the boxplot and density plot side by side with equal widths
  grid.arrange(box_plot, density_plot, ncol = 2, widths = c(1.2, 1.2))
  # Interpretation for the variable (sep = "" avoids a stray space before each bullet)
  if (var == "Age") {
    cat("\n**Age**\n\n",
        "- The boxplot shows a similar distribution of ages among satisfied and unsatisfied customers. ",
        "The median age is close for both groups. ",
        "The density plot suggests a slightly higher density of younger satisfied customers (below 40), while unsatisfied customers are more evenly spread across ages.\n\n",
        sep = "")
  } else if (var == "Income") {
    cat("\n**Income**\n\n",
        "- The boxplot indicates that both satisfied and unsatisfied groups have a similar income distribution. ",
        "However, the density plot shows a higher density of satisfied customers in the higher income brackets, suggesting that income may be associated with satisfaction.\n\n",
        sep = "")
  } else if (var == "AverageSpend") {
    cat("\n**Average Spend**\n\n",
        "- In the boxplot, the median spending appears similar between the two groups, ",
        "but the density plot suggests that satisfied customers are more concentrated in the higher spending ranges, ",
        "whereas unsatisfied customers are more evenly distributed.\n\n",
        sep = "")
  } else if (var == "GroupSize") {
    cat("\n**Group Size**\n\n",
        "- The boxplot shows that the median group size is similar for both satisfaction levels. ",
        "However, the density plot reveals a slightly higher density of satisfied customers for smaller group sizes, ",
        "whereas unsatisfied customers show more variation in group size.\n\n",
        sep = "")
  } else if (var == "WaitTime") {
    cat("\n**Wait Time**\n\n",
        "- The boxplot shows that satisfied customers tend to have shorter wait times, as indicated by a lower median. ",
        "The density plot supports this, showing a higher density of satisfied customers with wait times of about 20 minutes or less, ",
        "while longer wait times are associated with unsatisfied customers.\n\n",
        sep = "")
  }
}

**Age**
- The boxplot shows a similar distribution of ages among satisfied and unsatisfied customers. The median age is close for both groups. The density plot suggests a slightly higher density of younger satisfied customers (below 40), while unsatisfied customers are more evenly spread across ages.

**Income**
- The boxplot indicates that both satisfied and unsatisfied groups have a similar income distribution. However, the density plot shows a higher density of satisfied customers in the higher income brackets, suggesting that income may be associated with satisfaction.

**Average Spend**
- In the boxplot, the median spending appears similar between the two groups, but the density plot suggests that satisfied customers are more concentrated in the higher spending ranges, whereas unsatisfied customers are more evenly distributed.

**Group Size**
- The boxplot shows that the median group size is similar for both satisfaction levels. However, the density plot reveals a slightly higher density of satisfied customers for smaller group sizes, whereas unsatisfied customers show more variation in group size.

**Wait Time**
- The boxplot shows that satisfied customers tend to have shorter wait times, as indicated by a lower median. The density plot supports this, showing a higher density of satisfied customers with wait times of about 20 minutes or less, while longer wait times are associated with unsatisfied customers.
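
Statements about medians read off a boxplot can be checked with a group summary and a rank-based test. The sketch below shows the pattern on a tiny made-up data frame: `toy_visits` and its values are invented purely for illustration and are not drawn from the `satisfaction` data. It computes the median wait time per group with `aggregate()` and compares the groups with `wilcox.test()`.

```r
# Toy data, invented for illustration only (NOT from the satisfaction dataset)
toy_visits <- data.frame(
  HighSatisfaction = factor(c(0, 0, 0, 0, 1, 1, 1, 1)),
  WaitTime         = c(35, 42, 28, 50, 15, 22, 18, 30)
)

# Median wait time within each satisfaction group
group_medians <- aggregate(WaitTime ~ HighSatisfaction, data = toy_visits, FUN = median)
print(group_medians)

# Wilcoxon rank-sum test comparing the two groups
wait_test <- wilcox.test(WaitTime ~ HighSatisfaction, data = toy_visits)
print(wait_test)
```

On the real data, replacing `toy_visits` with `satisfaction` (and `WaitTime` with any of the numerical variables) would give the corresponding summary, assuming `HighSatisfaction` is stored as a two-level factor.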